Global options for outputs/formatting
Plain-text-formatting syntax allows for conversion to multiple document types
knit button or command/CTRL + shift + K
Render files in the console with rmarkdown::render(file, output_format) (this is what knit is doing)
Option to create PDF from HTML with pagedown::chrome_print(file)
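As a rough sketch of what rendering from the console can look like, the snippet below writes a tiny .Rmd to a temporary folder and renders it. The file name and its contents are made up for illustration, and the render step is skipped when rmarkdown or pandoc is not available on the machine.

```r
# Hypothetical file name and contents, written to a temporary directory
rmd_path <- file.path(tempdir(), "example_report.Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "Two plus two is `r 2 + 2`."
), rmd_path)

# Render only if rmarkdown and pandoc are available on this machine
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() returns the path of the file it created
  html_path <- rmarkdown::render(rmd_path,
                                 output_format = "html_document",
                                 quiet = TRUE)
  # Optional PDF via headless Chrome; needs the pagedown package and Chrome
  # pagedown::chrome_print(html_path)
}
```

The chrome_print() call is left commented out because it depends on a local Chrome or Chromium installation.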
\(\LaTeX\) templates (pay attention to $ in templates)
Word Document: use the Styles Pane and “Update to Match Selection”
These files go in the same place as your .Rmd
Example from my CV with both template-specific YAML parameters and YAML references
Examples of calling packages in the YAML header and using inline functions from my stats homework (also see Writing Your Thesis with R Markdown)
Stats homework output
Examples from my Psychonomics poster
To make extremely custom edits to templates, sometimes you have to edit the template documents
# Load packages for the pipe, data manipulation, and tables
library(dplyr)
library(knitr)
library(kableExtra)
# Make dataframe with installed packages
pkgs <- installed.packages() %>%
  as.data.frame()
# Pull posterdown package
pstr <- pkgs %>%
  select(Package, LibPath, Version, Depends, Imports) %>%
  dplyr::filter(Package == "posterdown")
# Make table
kable(pstr) %>%
  kable_styling(bootstrap_options = "condensed",
                font_size = 18)

| Package | LibPath | Version | Depends | Imports |
|---|---|---|---|---|
| posterdown | /Library/Frameworks/R.framework/Versions/3.5/Resources/library | 1.0 | NA | pagedown, rmarkdown, yaml |
Knitting the vitae::awesomecv template created a .cls file that I could edit to change font sizes/colors
:: for unloaded packages and conflicting functions
Warnings won’t stop your document from compiling, but generally indicate that you should change something in your code
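A minimal sketch of both points, using only functions that ship with R. It calls a function through :: without attaching its package (stats::filter() is the one that dplyr::filter() masks once dplyr is loaded), then contrasts a warning, which still returns a value, with an error, which stops evaluation. The input values are made up for illustration.

```r
# Call a function from a package without attaching it with library()
smoothed <- stats::filter(1:5, rep(1 / 3, 3))

# A warning: the code keeps running and still returns a value (with an NA)
converted <- withCallingHandlers(
  as.numeric(c("1", "two", "3")),
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# An error: evaluation stops, so nothing is returned unless it is caught
failed <- tryCatch(
  log("ten"),
  error = function(e) conditionMessage(e)
)
```

Here converted still holds a numeric vector with one NA, while failed only holds the error message because log() never produced a result.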
Chunk error
Markdown/YAML error
Running a chunk executes the code in the console and adds the output to your R environment; however, your R environment is separate from the environment created when “knitting” a document
# Define new variable y
y <- 100
# When I run this chunk, I get the expected output (150),
# but it fails when I try to knit the document
print(x + y)
rm(list = ls()) to clear your R environment
?package to open the help page
Stack Overflow is your friend!
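The difference between the console environment and the knit environment can be imitated in plain R. This is only an analogy, not how knitr is implemented: two environments stand in for the interactive session and for a fresh knit, and x = 50 is an assumption inferred from the expected output of 150.

```r
# Stand-in for the interactive session: x and y both exist here
session_env <- new.env()
assign("x", 50, envir = session_env)   # assumed value, defined earlier by hand
assign("y", 100, envir = session_env)
interactive_result <- get("x", envir = session_env) + get("y", envir = session_env)

# Stand-in for a fresh knit: only y is defined by the document's chunks
knit_env <- new.env()
assign("y", 100, envir = knit_env)
knit_result <- tryCatch(
  get("x", envir = knit_env, inherits = FALSE) +
    get("y", envir = knit_env, inherits = FALSE),
  error = function(e) conditionMessage(e)
)

# Clearing the session environment reproduces the failure before knitting
rm(list = ls(envir = session_env), envir = session_env)
remaining <- ls(envir = session_env)
```

Running rm(list = ls()) in the console and then executing the chunks top to bottom is a quick way to find variables that only exist in the interactive session.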
scholar package for automatically downloading citations from Google Scholar
devtools package for installing packages/plug-ins from GitHub (e.g., papaja)
Update packages (e.g., rmarkdown) in library with update.packages(path)
install.packages(package)
updateR package
tidyr functions
The way R stores your information will determine the kinds of functions/operators you can use
Functions take a certain number and certain types of “arguments”
base R functions: installed with R itself
install.packages(package) once (or to update package)
library(package) every R session
Use Help window or ?package to check argument names, types, and defaults
base R
## [1] 3
# If I give it the wrong kind of argument, it will just return NULL
# Some functions won't run at all with wrong kind of argument
nrow(a_list)
## NULL
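A short sketch of how the type of object changes what a function returns. The definition of a_list is not shown in the original, so the objects below are hypothetical stand-ins.

```r
# Hypothetical objects for illustration
a_list <- list(1, "b", TRUE)
a_df <- data.frame(id = 1:3, label = c("a", "b", "c"))

length(a_list)  # lists have a length
nrow(a_list)    # NULL: a list has no dim attribute
nrow(a_df)      # data frames have rows
NROW(a_list)    # NROW() treats a vector or list as a single column

# mean() on text warns and returns NA; sum() on text stops with an error
mean_result <- suppressWarnings(mean(c("1", "2")))
sum_result <- tryCatch(sum(c("1", "2")), error = function(e) "error")
```

Checking class() or str() on an object before passing it to a function usually explains results like these.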
# scholar package, plus dplyr for the pipe/verbs and knitr for kable()
library(scholar)
library(dplyr)
library(knitr)
# get_publications function
# Pull publications from Google Scholar for Marie Curie
get_publications("EmD_lTEAAAAJ&EmD_lTEAAAAJ&") %>%
  dplyr::filter(cites > 30) %>%
  distinct(title, .keep_all = TRUE) %>%
  select(author, title) %>%
  head(2) %>%
  kable()

| author | title |
|---|---|
| P Curie, M Sklodowska-Curie | Sur une substance nouvelle radio-active, contenue dans la pechblende |
| E Curie | Madame Curie: a biography |
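The Google Scholar call needs network access, so here is a sketch of the same chain of verbs on a small made-up data frame. The column names mirror the ones used in the pipeline; the rows are invented and the dplyr package is assumed to be installed.

```r
library(dplyr)

# Made-up publication records (hypothetical, for illustration only)
pubs <- data.frame(
  author = c("A Author", "A Author", "B Writer", "C Scholar"),
  title  = c("First paper", "First paper", "Second paper", "Third paper"),
  cites  = c(120, 120, 45, 10),
  stringsAsFactors = FALSE
)

top_pubs <- pubs %>%
  dplyr::filter(cites > 30) %>%          # keep well-cited rows
  distinct(title, .keep_all = TRUE) %>%  # drop duplicate titles, keep other columns
  select(author, title) %>%              # keep two columns
  head(2)                                # first two rows
```

Each step receives the data frame produced by the step before it, which is why no intermediate variables are needed.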
is.na(), exists(), etc. will return TRUE/FALSE values
grep(), filter(), str_detect(), etc. use TRUE/FALSE values
General parameters for csv files
read.csv("file_name.csv",
         header = TRUE,
         stringsAsFactors = FALSE,
         check.names = FALSE,
         na.strings = "")
Avoid special characters (including spaces) in file names, directories, and column headers!
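To see what these read.csv() arguments do, the sketch below writes a small made-up CSV to a temporary file and reads it back. The file name, column names, and values are all hypothetical; one header deliberately contains a space to show why such names are awkward.

```r
# Write a small made-up CSV to a temporary file
csv_path <- file.path(tempdir(), "example_data.csv")
writeLines(c(
  "id,reaction time,group",
  "1,350,control",
  "2,,treatment",
  "3,410,"
), csv_path)

dat <- read.csv(csv_path,
                header = TRUE,
                stringsAsFactors = FALSE,
                check.names = FALSE,
                na.strings = "")

names(dat)           # "reaction time" keeps its space because check.names = FALSE
is.na(dat$group)     # the empty field in the last row was read as NA
dat$`reaction time`  # names with spaces need backticks every time they are used
```

With check.names = TRUE (the default), the header would have been changed to reaction.time instead.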
readxl package for Excel spreadsheets
qualtRics package for Qualtrics data
ggmap package for Google services (geolocation data)
read_table() from readr package for text files
tidyr
%>%: pass the results of one function on to another
select(): choose columns by name
mutate(): add/change columns
filter(): filter for (or out) rows
group_by() and summarise(): perform operations on groups of data
gather() and spread() (superseded by pivot_longer() and pivot_wider()): condense multiple columns into one or the inverse
separate() and unite(): split a column into multiple or the inverse
tidyr functions
slice(): choose a row
pull(): choose a column
select() helpers (e.g., contains())
join family of functions: combine datasets based on a shared unique identifier
union(): combine datasets by rows (column names must be the same)
replace_na()/drop_na(): alter/remove rows with NA values
base R functions
rbind() and cbind(): add rows/columns
nrow() and ncol(): count rows/columns
unique(): pull unique values
var$column and var[row, column]
which() with column/row indexing
tibble package for dataframes with tibble()
kableExtra package for kable() tables
ggplot2 package for graphs (see the ggplot2 cheat sheet)
factor() for ordering text labels in graphs
na.rm = TRUE argument (e.g., in mean()) to remove NA values from calculations
grep(), agrep()
grepl(), agrepl(), str_detect()
sub(), gsub(), replace()
regexpr(), gregexpr(), regexec()
perl = TRUE argument to handle especially complex patterns
# List of elements
fruit <- c("apple", "banana", "pear", "pinapple")
# grep position
grep(pattern = "le", x = fruit)
## [1] 1 4
## [1] "apple" "pinapple"
## [1] 1 3 4
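# regexpr
# The returned values give the starting position of each match (-1 if no match)
# match.length attribute gives the length of the matched text
# index.type attribute says whether positions are counted in characters or bytes
regexpr(pattern = "le", text = fruit)
## [1] 4 -1 -1 7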
## attr(,"match.length")
## [1] 2 -1 -1 2
## attr(,"index.type")
## [1] "chars"
## attr(,"useBytes")
## [1] TRUE
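For comparison, here is a sketch that runs several of the base R pattern functions on the same fruit vector. The patterns are chosen only for illustration.

```r
fruit <- c("apple", "banana", "pear", "pinapple")

positions <- grep("le", fruit)                # positions of matching elements
values <- grep("le", fruit, value = TRUE)     # the matching elements themselves
flags <- grepl("le", fruit)                   # one TRUE/FALSE per element
first_only <- sub("p", "P", fruit)            # replace the first match in each element
every_match <- gsub("p", "P", fruit)          # replace every match in each element

# regmatches() uses the positions from regexpr() to extract the matched text
matches <- regmatches(fruit, regexpr("an", fruit))
```

The logical vector from grepl() is the form that works directly inside filter() or bracket indexing, for example fruit[flags].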
# Variables
vals <- rep(1:3, 3)
name <- "assign_example"
# Assign values to variable name
assign(name, vals)
# Use the variable as usual
assign_example
## [1] 1 2 3 1 2 3 1 2 3
## [1] 1 2 3 1 2 3 1 2 3
# You can add to this variable dynamically as well
assign(name, c(get(name), 4:6))
# New output
assign_example
## [1] 1 2 3 1 2 3 1 2 3 4 5 6
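assign() and get() are most useful when variable names are built while the code runs. The sketch below uses hypothetical names (block_1, block_2, ...) and also shows a named list, which often does the same job and is easier to loop over.

```r
# Build several variables whose names are only known at run time
for (i in 1:3) {
  assign(paste0("block_", i), i * 10)
}
second_block <- get("block_2")

# exists() checks for a name before get() is called on it
has_fourth <- exists("block_4")

# A named list keeps the same values together in one object
blocks <- list()
for (i in 1:3) {
  blocks[[paste0("block_", i)]] <- i * 10
}
second_from_list <- blocks[["block_2"]]
```

Which approach to use is a design choice: separate variables match the assign() example, while a list can be passed to lapply() or sapply() in one call.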
apply(), lapply(), sapply(), tapply()
source() variables from R scripts
%notin% and %in% (compared to != and ==)
# Example using apply(): go across columns of dataset
# and substitute characters
language <- apply(language, 2,
                  function(x) gsub("\\\\", "", x, fixed = TRUE))
# Source other scripts
source("data_cleaning.R", local = TRUE)
# Helper function
"%notin%" <- Negate("%in%")
# Example from processing pipeline for Qualtrics data
unusable <- c("0","00","107")
dat %>% dplyr::filter(Progress==100 & ID %notin% unusable)
ggplot2 parameters in a list()
list() vs. c()
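Two of these comparisons are easy to check in the console. The sketch below uses made-up IDs to show why %in% is safer than == when matching against a set of values, and how c() and list() treat mixed types.

```r
ids <- c("107", "12", "0", "45")   # hypothetical participant IDs
unusable <- c("0", "00", "107")

# %in% asks, for each id, whether it appears anywhere in unusable
in_result <- ids %in% unusable

# == compares element by element and recycles the shorter vector,
# so it can miss matches (and warns when the lengths do not line up)
eq_result <- suppressWarnings(ids == unusable)

# c() coerces everything to one type; list() keeps each element as it is
combined <- c(1, "a", TRUE)
kept <- list(1, "a", TRUE)
combined_class <- class(combined)
kept_classes <- sapply(kept, class)
```

Here in_result flags the first and third IDs, while eq_result flags none of them because the positions do not line up.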